ApproxSeek: Web Document Search Using Approximate Matching
Abstract
Conventional search engines find Web pages by keyword matching. Keyword matching has two disadvantages: (i) the order of the keywords is ignored; and (ii) approximate queries submitted by users are beyond what keyword matching can handle. Keyword matching alone is therefore not satisfactory for Web page search. This paper proposes a Web search system, named ApproxSeek, to facilitate Web information retrieval. ApproxSeek achieves better search results by comparing the longest approximate common subsequences, which are modified longest common subsequences with fault-tolerance capability. Experimental data show that the proposed method improves Web search, though more tests need to be conducted to support this conclusion.
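The abstract does not define the longest approximate common subsequence precisely, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the query and the page are compared as sequences of keywords, and that "fault tolerance" means two keywords count as a match when their edit distance is at most a small threshold. The names `edit_distance`, `approx_lcs_length`, and `max_edits` are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, using a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def approx_lcs_length(query: list[str], doc: list[str], max_edits: int = 1) -> int:
    """Length of the longest common subsequence of two keyword sequences,
    where two keywords match if their edit distance is <= max_edits.

    Assumption: this tolerance rule is illustrative only; the paper's own
    definition of an approximate common subsequence may differ.
    """
    rows, cols = len(query), len(doc)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if edit_distance(query[i - 1], doc[j - 1]) <= max_edits:
                # Tolerant match: extend the subsequence ending before both keywords.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # No match: carry over the best result from dropping one keyword.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[rows][cols]


if __name__ == "__main__":
    page = ["web", "search", "engine"]
    print(approx_lcs_length(["web", "serch", "engine"], page))  # misspelled query
    print(approx_lcs_length(["engine", "search", "web"], page))  # reversed order
```

Unlike plain keyword matching, a score of this kind depends on keyword order, since a subsequence must preserve it, and it tolerates small spelling errors in the query. These are the two shortcomings the abstract attributes to keyword matching.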
Similar resources
Penerapan E-Service Berbasis Android pada Divisi Pelayanan Perbaikan Komputer CV Ria Kencana Ungu (RKU)
Archival information systems in government agencies are among the applications most used for daily activities. One feature of a document information management application is search. This feature finds documents in a collection of available information based on keywords entered by the user. However, some research on search engines has concluded that the average user error...
Sistem Informasi Pengarsipan Menggunakan Algoritma Levensthein String pada Kecamatan Seberang Ulu II
Archival information systems in government agencies are among the applications most used for daily activities. One feature of a document information management application is search. This feature finds documents in a collection of available information based on keywords entered by the user. However, some research on search engines has concluded that the average user error...
Term-specific eigenvector-centrality in multi-relation networks
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem. This article investigates how eigenvector-centrality can be used for approximate matching in multi-relation graphs, that is, graphs where connections of many different types may exist. Based on an extension of the PageRank matrix, e...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes that generate a list of relevant responses to a query. The document processor, the matching function, and the query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...